Smart Paper Notes

Interpretable ML for Imbalanced Data

Damien A. Dablain, Colin Bellinger, Bartosz Krawczyk, David W. Aha, Nitesh V. Chawla

Category: Machine Learning

2022-12-15

Deep learning models are being increasingly applied to imbalanced data in high stakes fields such as medicine, autonomous driving, and intelligence analysis. Imbalanced data compounds the black-box nature of deep networks because the relationships between classes may be highly skewed and unclear. This can reduce trust by model users and hamper the progress of developers of imbalanced learning algorithms. Existing methods that investigate imbalanced data complexity are geared toward binary classification, shallow learning models and low dimensional data. In addition, current eXplainable Artificial Intelligence (XAI) techniques mainly focus on converting opaque deep learning models into simpler models (e.g., decision trees) or mapping predictions for specific instances to inputs, instead of examining global data properties and complexities. Therefore, there is a need for a framework that is tailored to modern deep networks, that incorporates large, high dimensional, multi-class datasets, and uncovers data complexities commonly found in imbalanced data (e.g., class overlap, sub-concepts, and outlier instances). We propose a set of techniques that can be used by both deep learning model users to identify, visualize and understand class prototypes, sub-concepts and outlier instances; and by imbalanced learning algorithm developers to detect features and class exemplars that are key to model performance. Our framework also identifies instances that reside on the border of class decision boundaries, which can carry highly discriminative information. Unlike many existing XAI techniques which map model decisions to gray-scale pixel locations, we use saliency through back-propagation to identify and aggregate image color bands across entire classes. Our framework is publicly available at https://github.com/dd1github/XAI_for_Imbalanced_Learning.


Towards A Holistic View of Bias in Machine Learning: Bridging Algorithmic Fairness and Imbalanced Learning

Damien Dablain, Bartosz Krawczyk, Nitesh Chawla

Category: Machine Learning

2022-07-13

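Machine learning (ML) is playing an increasingly important role in rendering decisions that affect a broad range of groups in society. ML models inform decisions in criminal justice, the extension of credit in banking, and the hiring practices of corporations. This creates a requirement for model fairness, which holds that automated decisions should be equitable with respect to protected features (e.g., gender, race, or age) that are often under-represented in the data. We postulate that this problem of under-representation has a corollary in the problem of imbalanced data learning. Such imbalance is often reflected in both classes and protected features. For example, one class (those receiving credit) may be over-represented with respect to another class (those not receiving credit), and a particular group (females) may be under-represented with respect to another group (males). A key element in achieving algorithmic fairness with respect to protected groups is the simultaneous reduction of class and protected-group imbalance in the underlying training data, which facilitates increases in both model accuracy and fairness. We discuss the importance of bridging imbalanced learning and group fairness by showing how key concepts in these fields overlap and complement each other, and we propose a novel oversampling algorithm, Fair Oversampling, that addresses both skewed class distributions and protected features. Our method: (i) can be used as an efficient pre-processing algorithm for standard ML algorithms to jointly address imbalance and group equity; and (ii) can be combined with fairness-aware learning algorithms to improve their robustness to varying levels of class imbalance. Additionally, we take a step toward bridging the gap between fairness and imbalanced learning with a new metric, Fair Utility, that combines balanced accuracy with fairness.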


Efficient Augmentation for Imbalanced Deep Learning

Damien Dablain, Colin Bellinger, Bartosz Krawczyk, Nitesh Chawla

Category: Machine Learning

2022-07-13

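Deep learning models memorize training data, which hurts their ability to generalize to under-represented classes. We empirically study a convolutional neural network's internal representation of imbalanced image data and measure the generalization gap between the model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes. This insight enables us to design an efficient three-phase CNN training framework for imbalanced data. The framework involves training the network end-to-end on imbalanced data to learn accurate feature embeddings, performing data augmentation in the learned embedding space to balance the training distribution, and fine-tuning the classifier head on the balanced embedded training data. We propose Expansive Over-Sampling (EOS) as the data augmentation technique to use in the training framework. EOS forms synthetic training instances as convex combinations between minority-class samples and their nearest enemies, in order to reduce the generalization gap. The proposed framework improves accuracy over leading cost-sensitive and resampling methods commonly used in imbalanced learning. Moreover, it is more efficient than standard data pre-processing methods, such as SMOTE and GAN-based oversampling, because it requires fewer parameters and less training time.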
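
The EOS step described above (synthetic instances as convex combinations of minority-class samples and their nearest enemies) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean distance, the choice of `k` nearest enemies, and the uniformly drawn mixing weight are all assumptions made for the example.

```python
import math
import random

def nearest_enemies(x, label, embeddings, labels, k):
    """Return the k embeddings closest to x whose label differs from `label`."""
    enemies = [e for e, y in zip(embeddings, labels) if y != label]
    return sorted(enemies, key=lambda e: math.dist(x, e))[:k]

def eos_oversample(embeddings, labels, minority_label, n_new, k=3, seed=0):
    """Create n_new synthetic minority embeddings (hypothetical helper).

    Each synthetic point is a convex combination of a minority sample and
    one of its k nearest enemies (nearest samples from other classes).
    """
    rng = random.Random(seed)
    minority = [e for e, y in zip(embeddings, labels) if y == minority_label]
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        enemy = rng.choice(nearest_enemies(x, minority_label, embeddings, labels, k))
        lam = rng.random()  # assumed: mixing weight drawn uniformly from [0, 1)
        synthetic.append([xi + lam * (ei - xi) for xi, ei in zip(x, enemy)])
    return synthetic

# Toy 2-D "embeddings": two minority points (label 1), three majority points (label 0).
emb = [[0.0, 0.0], [0.2, 0.0], [1.0, 0.0], [1.2, 0.0], [1.4, 0.0]]
lab = [1, 1, 0, 0, 0]
new_points = eos_oversample(emb, lab, minority_label=1, n_new=5, k=1)
```

In the three-phase framework from the abstract, points like `new_points` would be added to the minority class in embedding space before the classifier head is fine-tuned; that surrounding training loop is not shown here.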


Developing an NLP-based Recommender System for the Ethical, Legal, and Social Implications of Synthetic Biology

Damien Dablain, Lilian Huang, Brandon Sepulvado

Category: Artificial Intelligence

2022-07-10

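Synthetic biology is an emerging field that involves the engineering and redesign of organisms for purposes such as food security, health, and environmental protection. As such, it poses numerous ethical, legal, and social implications (ELSI) for researchers and policy makers. Various efforts to ensure socially responsible synthetic biology are underway. Policy making is one regulatory avenue, and other initiatives have sought to embed social scientists and ethicists in synthetic biology projects. However, given the nascency of synthetic biology, the number of heterogeneous domains it spans, and the open nature of many ethical questions, it has proven challenging to establish widespread concrete policies, and including social scientists and ethicists on synthetic biology teams has met with mixed success. This text proposes a different approach, asking whether it is possible to develop a well-performing recommender model based on natural language processing (NLP) to connect synthetic biologists with information on the ELSI of their specific research. This recommender was developed as part of a larger project building a Synthetic Biology Knowledge System (SBKS) to accelerate discovery and exploration of the synthetic biology design space. Our approach aims to distill ethical and social scientific information relevant to synthetic biologists and embed it into synthetic biology research workflows.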


Risk-Sensitive Policy with Distributional Reinforcement Learning

Thibaut Théate, Damien Ernst

Category: Machine Learning | Artificial Intelligence

2022-12-30

Classical reinforcement learning (RL) techniques are generally concerned with the design of decision-making policies driven by the maximisation of the expected outcome. Nevertheless, this approach does not take into consideration the potential risk associated with the actions taken, which may be critical in certain applications. To address that issue, the present research work introduces a novel methodology based on distributional RL to derive sequential decision-making policies that are sensitive to the risk, the latter being modelled by the tail of the return probability distribution. The core idea is to replace the $Q$ function generally standing at the core of learning schemes in RL by another function taking into account both the expected return and the risk. Named the risk-based utility function $U$, it can be extracted from the random return distribution $Z$ naturally learnt by any distributional RL algorithm. This makes it possible to span the complete potential trade-off between risk minimisation and expected return maximisation, in contrast to fully risk-averse methodologies. Fundamentally, this research yields a truly practical and accessible solution for learning risk-sensitive policies with minimal modification to the distributional RL algorithm, and with an emphasis on the interpretability of the resulting decision-making process.
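
One way to picture a risk-based utility extracted from a learned return distribution is sketched below. The abstract does not give the form of $U$, so everything here is an assumption for illustration: $Z$ is represented by equally weighted return samples, the risk term is a CVaR-style mean of the worst `rho` fraction of returns, and $U$ is a linear blend controlled by a hypothetical weight `alpha`.

```python
import math

def expected_return(z_samples):
    """Mean of the return samples; plays the role of the usual Q value."""
    return sum(z_samples) / len(z_samples)

def tail_risk(z_samples, rho=0.25):
    """Mean of the worst rho-fraction of returns (assumed CVaR-style tail statistic)."""
    n_tail = max(1, math.ceil(rho * len(z_samples)))
    worst = sorted(z_samples)[:n_tail]
    return sum(worst) / n_tail

def risk_based_utility(z_samples, alpha=0.5, rho=0.25):
    """Assumed blend: alpha = 1 recovers the expected return, alpha = 0 is fully risk-averse."""
    return alpha * expected_return(z_samples) + (1 - alpha) * tail_risk(z_samples, rho)

def greedy_action(z_by_action, alpha=0.5, rho=0.25):
    """Pick the action whose return distribution has the highest utility."""
    return max(z_by_action, key=lambda a: risk_based_utility(z_by_action[a], alpha, rho))

# Toy return distributions for two actions in one state.
z = {"safe": [1.0, 1.0, 1.0, 1.0], "risky": [-4.0, 2.0, 3.0, 4.0]}
choice_risk_neutral = greedy_action(z, alpha=1.0)    # ranks by expected return only
choice_risk_sensitive = greedy_action(z, alpha=0.5)  # also penalises the bad tail
```

Sweeping `alpha` between 0 and 1 is one simple way to move between risk minimisation and expected-return maximisation, which is the trade-off the abstract refers to.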


Explainability as statistical inference

Hugo Henri Joseph Senetaire, Damien Garreau, Jes Frellsen, Pierre-Alexandre Mattei

Category: Machine Learning | Artificial Intelligence

2022-12-06

A wide variety of model explanation approaches have been proposed in recent years, all guided by very different rationales and heuristics. In this paper, we take a new route and cast interpretability as a statistical inference problem. We propose a general deep probabilistic model designed to produce interpretable predictions. The model parameters can be learned via maximum likelihood, and the method can be adapted to any predictor network architecture and any type of prediction problem. Our method is a case of amortized interpretability models, where a neural network is used as a selector to allow for fast interpretation at inference time. Several popular interpretability methods are shown to be particular cases of regularised maximum likelihood for our general model. We propose new datasets with ground truth selection which allow for the evaluation of feature importance maps. Using these datasets, we show experimentally that using multiple imputation provides more reasonable interpretations.


Value-based CTDE Methods in Symmetric Two-team Markov Game: from Cooperation to Team Competition

Pascal Leroy, Jonathan Pisane, Damien Ernst

Category: Machine Learning

2022-11-21

In this paper, we identify the best learning scenario to train a team of agents to compete against multiple possible strategies of opposing teams. We evaluate cooperative value-based methods in a mixed cooperative-competitive environment. We restrict ourselves to the case of a symmetric, partially observable, two-team Markov game. We selected three training methods based on the centralised training and decentralised execution (CTDE) paradigm: QMIX, MAVEN and QVMix. For each method, we considered three learning scenarios differentiated by the variety of team policies encountered during training. For our experiments, we modified the StarCraft Multi-Agent Challenge environment to create competitive environments where both teams could learn and compete simultaneously. Our results suggest that training against multiple evolving strategies achieves the best results when, for scoring their performances, teams are faced with several strategies.


Probing neural language models for understanding of words of estimative probability

Damien Sileo, Marie-Francine Moens

Category: Natural Language Processing | Artificial Intelligence

2022-11-07

Words of estimative probability (WEP) are expressions of a statement's plausibility (probably, maybe, likely, doubt, unlikely, impossible...). Multiple surveys demonstrate the agreement of human evaluators when assigning numerical probability levels to WEP. For example, highly likely corresponds to a median chance of 0.90±0.08 in Fagen-Ulmschneider (2015)'s survey. In this work, we measure the ability of neural language processing models to capture the consensual probability level associated with each WEP. Firstly, we use the UNLI dataset (Chen et al., 2020), which associates premises and hypotheses with their perceived joint probability p, to construct prompts, e.g. "[PREMISE]. [WEP], [HYPOTHESIS]." and assess whether language models can predict whether the WEP consensual probability level is close to p. Secondly, we construct a dataset of WEP-based probabilistic reasoning, to test whether language models can reason with WEP compositions. When prompted "[EVENTA] is likely. [EVENTB] is impossible.", a causal language model should not express that [EVENTA&B] is likely. We show that both tasks are unsolved by off-the-shelf English language models, but that fine-tuning leads to transferable improvement.


SelecMix: Debiased Learning by Contradicting-pair Sampling

Inwoo Hwang, Sangjun Lee, Yunhyeok Kwak, Seong Joon Oh, Damien Teney, Jin-Hwa Kim, Byoung-Tak Zhang

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-11-04

Neural networks trained with ERM (empirical risk minimization) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when training labels are strongly correlated with undesirable features. To prevent a network from learning such features, recent methods augment training data such that examples displaying spurious correlations (i.e., bias-aligned examples) become a minority, whereas the other, bias-conflicting examples become prevalent. However, these approaches are sometimes difficult to train and scale to real-world data because they rely on generative models or disentangled representations. We propose an alternative based on mixup, a popular augmentation that creates convex combinations of training examples. Our method, coined SelecMix, applies mixup to contradicting pairs of examples, defined as showing either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features. Identifying such pairs requires comparing examples with respect to unknown biased features. For this, we utilize an auxiliary contrastive model with the popular heuristic that biased features are learned preferentially during training. Experiments on standard benchmarks demonstrate the effectiveness of the method, in particular when label noise complicates the identification of bias-conflicting examples.
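
The pair-selection rule in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: `bias_feats` is a hypothetical stand-in for the biased features (which the paper obtains from an auxiliary contrastive model), Euclidean distance is an assumed similarity measure, and how labels are assigned to the mixed example is deliberately left out because the abstract does not specify it.

```python
import math

def mixup(x_a, x_b, lam):
    """Convex combination of two inputs, as in plain mixup."""
    return [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]

def contradicting_partner(i, labels, bias_feats, same_label):
    """Index of a contradicting partner for example i (hypothetical helper).

    same_label=True : same label, most dissimilar biased feature (case i).
    same_label=False: different label, most similar biased feature (case ii).
    """
    candidates = [j for j in range(len(labels))
                  if j != i and (labels[j] == labels[i]) == same_label]

    def dist(j):
        return math.dist(bias_feats[i], bias_feats[j])

    return max(candidates, key=dist) if same_label else min(candidates, key=dist)

# Toy data: inputs, labels, and a 1-D stand-in for the biased feature.
inputs = [[0.0, 2.0], [1.0, 1.0], [4.0, 6.0], [2.0, 2.0], [5.0, 5.0]]
labels = [0, 0, 0, 1, 1]
bias_feats = [[0.0], [0.1], [0.9], [0.05], [1.0]]

j_same = contradicting_partner(0, labels, bias_feats, same_label=True)
j_diff = contradicting_partner(0, labels, bias_feats, same_label=False)
mixed = mixup(inputs[0], inputs[j_same], lam=0.25)
```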


The Curse of Unrolling: Rate of Differentiating Through Optimization

Damien Scieur, Quentin Bertrand, Gauthier Gidel, Fabian Pedregosa

Category: Machine Learning (Statistics)

2022-09-27

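Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few. Unrolled differentiation is a popular heuristic that approximates the solution using an iterative solver and differentiates it through the computational path. This work provides a non-asymptotic convergence-rate analysis of this approach on quadratic objectives for gradient descent and the Chebyshev method. We show that to ensure convergence of the Jacobian, we can either 1) choose a large learning rate, leading to fast asymptotic convergence but accepting that the algorithm may have an arbitrarily long burn-in phase, or 2) choose a smaller learning rate, leading to immediate but slower convergence. We refer to this phenomenon as the curse of unrolling. Finally, we discuss open problems relative to this approach, such as deriving a practical update rule for the optimal unrolling strategy and making novel connections with the field of Sobolev orthogonal polynomials.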
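
Unrolled differentiation through gradient descent can be illustrated on a one-dimensional quadratic. The sketch below is a toy example, not the paper's analysis: the objective $f(x;\theta) = \tfrac{1}{2}\theta x^2 - b x$, the zero initialisation, and the function name are assumptions chosen so that the exact Jacobian $dx^*/d\theta = -b/\theta^2$ is known and the unrolled estimate can be compared against it.

```python
def unrolled_jacobian(theta, b, lr, n_steps):
    """Run gradient descent on f(x; theta) = 0.5*theta*x**2 - b*x from x = 0
    and propagate d x_t / d theta alongside the iterates (forward mode).

    Returns the final iterate, the final Jacobian estimate, and the absolute
    Jacobian error after each step.
    """
    exact = -b / theta ** 2          # derivative of x* = b / theta
    x, jac = 0.0, 0.0
    errors = []
    for _ in range(n_steps):
        # Differentiate the update x <- x - lr*(theta*x - b) w.r.t. theta,
        # using the current x before it is overwritten.
        jac = (1 - lr * theta) * jac - lr * x
        x = x - lr * (theta * x - b)
        errors.append(abs(jac - exact))
    return x, jac, errors

# Small step size: lr * theta = 0.2.
x_small, jac_small, err_small = unrolled_jacobian(theta=2.0, b=1.0, lr=0.1, n_steps=200)
# Step size near the stability limit: lr * theta = 1.9.
x_large, jac_large, err_large = unrolled_jacobian(theta=2.0, b=1.0, lr=0.95, n_steps=400)
```

In this toy setting both runs eventually reach the exact Jacobian, but with the step size near the stability limit the Jacobian error first grows before it decays, which gives a feel for the burn-in behaviour mentioned in the abstract. The scalar example does not reproduce the speed trade-off between large and small learning rates, which depends on the spectrum of a multi-dimensional problem.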
